Data Mapper: An Operator for Expressing One-to-Many Data Transformations

Authors

  • Paulo Carreira
  • Helena Galhardas
  • João L. M. Pereira
  • Antónia Lopes
Abstract

Transforming data is a fundamental operation in application scenarios involving data integration, legacy data migration, data cleaning, and extract-transform-load processes. Data transformations are often implemented as relational queries that aim at leveraging the optimization capabilities of most RDBMSs. However, relational query languages like SQL are not expressive enough to specify an important class of data transformations that produce several output tuples for a single input tuple. This class of data transformations is required for solving the data heterogeneities that occur when source data represents an aggregation of target data. In this paper, we propose and formally define the data mapper operator as an extension of the relational algebra to address one-to-many data transformations. We supply an algebraic rewriting technique that enables the optimization of data transformation expressions that combine filters expressed as standard relational operators with mappers. Furthermore, we identify the two main factors that influence the expected optimization gains.
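The idea of a one-to-many transformation, and of rewriting an expression that combines a filter with a mapper, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `mapper` and `split_order`, the order/installment data, and the choice of "push the filter below the mapper" as the rewriting shown. It is not the paper's formal definition or its actual rewriting rules.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def mapper(rows: Iterable[Row], fn: Callable[[Row], Iterable[Row]]) -> Iterator[Row]:
    """Apply fn to every input row; fn may yield zero, one, or many output rows."""
    for row in rows:
        yield from fn(row)


def split_order(row: Row) -> Iterator[Row]:
    """Hypothetical mapper function: the source row aggregates n equal installments."""
    n = row["installments"]
    for seq in range(1, n + 1):
        yield {
            "order_id": row["order_id"],  # copied unchanged from the input row
            "seq": seq,
            "amount": row["total"] / n,
        }


# Hypothetical source data: each row represents an aggregation of target rows.
orders: List[Row] = [
    {"order_id": 1, "total": 300.0, "installments": 3},
    {"order_id": 2, "total": 100.0, "installments": 2},
]

# Filter applied after the mapper: every order is expanded first.
late = [r for r in mapper(orders, split_order) if r["order_id"] == 2]

# Filter applied before the mapper: only the matching order is expanded.
# The two plans agree here because order_id is copied through unchanged.
early = list(mapper((r for r in orders if r["order_id"] == 2), split_order))

assert late == early
```

In this sketch the second plan expands one order instead of two, which suggests why filtering early can reduce the work done by the mapper; how large the gain is in practice depends on the data and the functions involved.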

Related resources

Extending Relational Algebra to express one-to-many data transformations

Application scenarios such as legacy-data migration, ETL processes, data cleaning and data-integration require the transformation of input tuples into output tuples. Traditional approaches for implementing these data transformations enclose solutions as Persistent Stored Modules (PSM) executed by an RDBMS or transformation code using a commercial ETL tool. Neither of these solutions is easily m...

One-to-many data transformations through data mappers

The optimization capabilities of RDBMSs are turning them attractive for executing data transformations. However, despite the fact that many useful data transformations can be expressed as relational queries, an important class of data transformations that produce several output tuples for a single input tuple cannot be expressed in that way. To overcome this limitation, we propose to extend Rel...

Extending the Relational Algebra with the Mapper Operator

Application scenarios such as legacy data migration, Extract-Transform-Load (ETL) processes, and data cleaning require the transformation of input tuples into output tuples. Traditional approaches for implementing these data transformations enclose solutions as Persistent Stored Modules (PSM) executed by an RDBMS or transformation code using a commercial ETL tool. Neither of these is easily main...

A Characterization of the Entropy--Gibbs Transformations

Let H be a finite dimensional complex Hilbert space, B(H)+ be the set of all positive semi-definite operators on H, and Phi be a (not necessarily linear) unital map of B(H)+ preserving the Entropy-Gibbs transformation. Then there exists either a unitary or an anti-unitary operator U on H such that Phi(A) = UAU* for any A in B(H)+. Thermodynamics, a branch of physics that is concerned with the study...

An Authorization Framework for Database Systems

Today, data plays an essential role in all levels of human life, from personal cell phones to medical, educational, military and government agencies. In such circumstances, the rate of cyber-attacks is also increasing. According to official reports, data breaches exposed 4.1 billion records in the first half of 2019. An information system consists of several components, which one of the most im...

Publication date: 2005